Initial evaluation of hidden dynamic models on conversational speech
Authors
Abstract
Conversational speech recognition is a challenging problem primarily because speakers rarely fully articulate sounds. A successful speech recognition approach must infer intended spectral targets from the speech data, or develop a method of dealing with large variances in the data. Hidden Dynamic Models (HDMs) attempt to automatically learn such targets in a hidden feature space using models that integrate linguistic information with constrained temporal trajectory models. HDMs are a radical departure from conventional hidden Markov models (HMMs), which simply account for variation in the observed data. In this paper, we present an initial evaluation of such models on a conversational speech recognition task involving a subset of the SWITCHBOARD corpus. We show that in an N-Best rescoring paradigm, HDMs are capable of delivering performance competitive with HMMs.
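The N-Best rescoring paradigm mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: a first-pass recognizer is assumed to have produced a short list of hypotheses, each hypothesis is assumed to carry a new acoustic log-likelihood from the rescoring model (for example an HDM) and a language-model log-probability, and the list is re-ranked by their weighted sum. The names `Hypothesis`, `rescore`, and `lm_weight`, and all scores, are hypothetical.

```python
from typing import List, NamedTuple


class Hypothesis(NamedTuple):
    """One entry of an N-best list (all fields are illustrative)."""
    text: str
    acoustic_logp: float  # log-likelihood assigned by the rescoring acoustic model
    lm_logp: float        # log-probability assigned by the language model


def rescore(nbest: List[Hypothesis], lm_weight: float = 1.0) -> List[Hypothesis]:
    """Return the hypotheses sorted from best to worst combined score."""
    return sorted(
        nbest,
        key=lambda h: h.acoustic_logp + lm_weight * h.lm_logp,
        reverse=True,
    )


# Made-up scores, only to show the re-ranking step.
nbest = [
    Hypothesis("i want to go", -120.0, -8.0),
    Hypothesis("i won't to go", -118.5, -12.0),
    Hypothesis("i want to goat", -125.0, -9.5),
]

best = rescore(nbest, lm_weight=1.0)[0]
print(best.text)
```

Because the rescoring model only has to score a fixed, small set of candidates, this setup lets a new acoustic model be compared with an HMM baseline without building a full decoder for it.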
Similar resources
Efficient decoding strategy for conversational speech recognition using state-space models for vocal-tract-resonance dynamics
In this paper, we present an efficient strategy for likelihood computation and decoding in a continuous speech recognizer using underlying state-space dynamic models for the hidden speech dynamics. The state-space models have been constructed in a special way so as to be suitable for the conversational or casual style of speech where phonetic reduction abounds. The interacting multiple model (I...
The AT&T large vocabulary conversational speech recognition system
We describe the AT&T recognition system used in the DARPA Large Vocabulary Conversational Speech Recognition (LVCSR98) evaluation. It is based on multi-pass rescoring of weighted Finite State Machines (FSMs) using progressively more accurate acoustic models. Acoustic models used in the system are all gender independent. They are based on three-state context-dependent hidden Markov models using G...
Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition
This paper reports our ongoing work aiming to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in the speech model. ...
Large Scale MMIE Training for Conversational Telephone Speech Recognition
This paper describes a lattice-based framework for maximum mutual information estimation (MMIE) of HMM parameters which has been used to train HMM systems for conversational telephone speech transcription using up to 265 hours of training data. These experiments represent the largest-scale application of discriminative training techniques for speech recognition of which the authors are aware, a...
A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech
In this paper we report our recent research whose goal is to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in th...
Publication date: 1999